 nlp algorithm


Automating Date Format Detection for Data Visualization

Liang, Zixuan

arXiv.org Artificial Intelligence

Data preparation, specifically date parsing, is a significant bottleneck in analytic workflows. To address this, we present two algorithms, one based on minimum entropy and the other on natural language modeling, that automatically derive date formats from string data. These algorithms achieve over 90% accuracy on a large corpus of data columns, streamlining the data preparation process within visualization environments. The minimum-entropy approach is particularly fast, providing interactive feedback. Our methods simplify date format extraction, making them suitable for integration into data visualization tools and databases. Recently, the integration of data visualization technologies such as Polaris [1] and Spotfire [2] has highlighted the importance of combining computational power with human expertise for effective data analysis. While computers excel at processing large datasets, people bring valuable domain expertise and the ability to recognize patterns visually [3], [4]. Systems that leverage both human feedback and machine processing prove more successful at extracting meaningful insights from data. Interactive visualization systems have become essential for enabling users to explore data while maintaining their analytical flow.
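To make the task concrete, here is a minimal Python sketch of date format detection on one column. It is not the minimum-entropy or language-model algorithm from the paper, whose details the abstract does not give; it substitutes a simple parse-rate heuristic. The candidate list and the function names `parse_rate` and `guess_format` are assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical candidate list; a real tool would consider many more formats.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y", "%d.%m.%Y"]


def parse_rate(values, fmt):
    """Fraction of strings in `values` that strptime accepts under `fmt`."""
    if not values:
        return 0.0
    ok = 0
    for value in values:
        try:
            datetime.strptime(value, fmt)
            ok += 1
        except ValueError:
            pass  # this string does not match the candidate format
    return ok / len(values)


def guess_format(values, candidates=CANDIDATE_FORMATS):
    """Return the candidate with the highest parse rate (first one wins ties)."""
    return max(candidates, key=lambda fmt: parse_rate(values, fmt))


column = ["13/02/2021", "01/03/2021", "25/12/2020"]
best = guess_format(column)
```

In this column, "13/02/2021" and "25/12/2020" cannot be month-first dates, so the day-first format parses every value and is selected. A column in which every day is 12 or lower stays ambiguous under this heuristic, which is one reason a real system needs a stronger scoring rule.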


Extraction of Sleep Information from Clinical Notes of Patients with Alzheimer's Disease Using Natural Language Processing

Sivarajkumar, Sonish, Tam, Thomas Yu Chow, Mohammad, Haneef Ahamed, Viggiano, Samual, Oniani, David, Visweswaran, Shyam, Wang, Yanshan

arXiv.org Artificial Intelligence

Alzheimer's Disease (AD) is the most common form of dementia in the United States. Sleep is one of the lifestyle-related factors that has been shown to be critical for optimal cognitive function in old age. However, there is a lack of research studying the association between sleep and AD incidence. A major bottleneck for conducting such research is that the traditional way to acquire sleep information is time-consuming, inefficient, non-scalable, and limited to patients' subjective experience. A gold standard dataset is created from manual annotation of 570 randomly sampled clinical note documents from adSLEEP, a corpus of 192,000 de-identified clinical notes of 7,266 AD patients retrieved from the University of Pittsburgh Medical Center (UPMC). We developed a rule-based Natural Language Processing (NLP) algorithm, machine learning models, and Large Language Model (LLM)-based NLP algorithms to automate the extraction of sleep-related concepts, including snoring, napping, sleep problem, bad sleep quality, daytime sleepiness, night wakings, and sleep duration, from the gold standard dataset. The rule-based NLP algorithm achieved the best F1 performance across all sleep-related concepts. In terms of Positive Predictive Value (PPV), the rule-based NLP algorithm achieved 1.00 for daytime sleepiness and sleep duration; machine learning models achieved 0.95 for napping, 0.86 for bad sleep quality, and 0.90 for snoring; and LLAMA2 with finetuning achieved 0.93 for night wakings, 0.89 for sleep problem, and 1.00 for sleep duration. The results show that the rule-based NLP algorithm consistently achieved the best performance for all sleep concepts. This study focused on the clinical notes of patients with AD, but could be extended to general sleep information extraction for other diseases.
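As a rough illustration of what rule-based concept extraction can look like, the Python sketch below matches a few sleep concepts with regular expressions. The patterns, the `RULES` table, and `extract_concepts` are hypothetical and are not the study's rule set, which the abstract does not describe.

```python
import re

# Hypothetical patterns for illustration only; not the rules used in the study.
RULES = {
    "snoring": re.compile(r"\bsnor(?:e|es|ed|ing)\b", re.IGNORECASE),
    "napping": re.compile(r"\bnap(?:s|ped|ping)?\b", re.IGNORECASE),
    "daytime sleepiness": re.compile(
        r"\bdaytime (?:sleepiness|somnolence)\b", re.IGNORECASE
    ),
}


def extract_concepts(note):
    """Return the set of concept names whose pattern occurs in the note text."""
    return {concept for concept, pattern in RULES.items() if pattern.search(note)}


found = extract_concepts("Patient reports loud snoring and naps after lunch.")
```

This sketch ignores negation, so a note saying "denies snoring" would still match the snoring rule. A practical rule set would need negation and context handling on top of keyword patterns.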


NLP for Knowledge Discovery and Information Extraction from Energetics Corpora

VanGessel, Francis G., Perry, Efrem, Mohan, Salil, Barham, Oliver M., Cavolowsky, Mark

arXiv.org Artificial Intelligence

The study of energetics necessarily involves numerous scientific domains, spanning shock physics and detonation science, fluid dynamics, material science, thermodynamics, and chemical synthesis. The plethora of sub-disciplines of math, physics, chemistry, and engineering poses a challenge to practitioners who wish to amass expertise in energetics. Furthermore, maintaining awareness of advancements in energetics research is complicated by the exponential rate at which new research is published across scientific disciplines, including energetics. Thus, the development of automated and intelligent approaches for extracting knowledge from papers, reports, textbooks, and patents related to energetics could aid researchers and accelerate progress in energetics science. Natural Language Processing (NLP) is a sub-field of linguistics, computer science, and Machine Learning (ML) involving the interactions between computers and human (natural) languages. NLP techniques are used to analyze and generate human language, allowing computers to read, interpret, and understand text and speech. In the context of energetics research, NLP can be used to analyze large volumes of textual data, such as scientific papers, technical reports, and patents, in order to extract relevant information about the concepts that underlie and explain energetics phenomena. Furthermore, NLP can enable natural language understanding that could be further applied to text mining journal articles and performing numerous natural language tasks such as classification, summarization, and recommendation. Overall, the use of NLP in energetics research has the potential to enhance our understanding of energetic materials and phenomena, and assist in the development of novel propellants, explosives, and pyrotechnics.


From Text to Meaning: How Natural Language Processing Algorithms Work

#artificialintelligence

Natural language processing (NLP) is a field of study that combines computer science and linguistics to help machines understand human language. NLP has become an integral part of modern technology, powering everything from chatbots to voice assistants. But how exactly do NLP algorithms work? And why do they matter? At its core, NLP is about teaching machines to understand human language.


Document Provenance and Authentication through Authorship Classification

Zamir, Muhammad Tayyab, Ayub, Muhammad Asif, Khan, Jebran, Ikram, Muhammad Jawad, Ahmad, Nasir, Ahmad, Kashif

arXiv.org Artificial Intelligence

Style analysis, a relatively less explored topic, enables several interesting applications. For instance, it allows authors to adjust their writing style to produce a more coherent document in collaboration. Similarly, style analysis can also be used as a primary step in document provenance and authentication. In this paper, we propose an ensemble-based text-processing framework for the classification of single and multi-authored documents, which is one of the key tasks in style analysis. The proposed framework incorporates several state-of-the-art text classification algorithms, including classical Machine Learning (ML) algorithms, transformers, and deep learning algorithms, both individually and in merit-based late fusion. For the merit-based late fusion, we employed several weight optimization and selection methods to assign merit-based weights to the individual text classification algorithms. We also analyze the impact on the task of characters that are usually excluded during pre-processing in NLP applications by conducting experiments on both clean and un-clean data. The proposed framework is evaluated on a large-scale benchmark dataset, significantly improving performance over the existing solutions.
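Merit-based late fusion can be sketched in Python as a weighted average of each classifier's class probabilities. The version below assumes the weights are simply validation scores normalized to sum to one; the paper's own weight optimization and selection methods are not specified in the abstract, so this stands in for them. The function names and all numbers are illustrative.

```python
def late_fusion(prob_lists, merits):
    """Weighted average of per-class probabilities from several classifiers.

    prob_lists: one probability vector per classifier, all of the same length.
    merits: one non-negative score per classifier (assumed here to be a
            validation score), normalized into weights that sum to 1.
    """
    total = sum(merits)
    weights = [m / total for m in merits]
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[k] for w, probs in zip(weights, prob_lists))
        for k in range(n_classes)
    ]


def predict(prob_lists, merits):
    """Index of the class with the highest fused probability."""
    fused = late_fusion(prob_lists, merits)
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative values: class 0 = single-authored, class 1 = multi-authored.
probs = [[0.60, 0.40], [0.30, 0.70], [0.45, 0.55]]
merits = [0.70, 0.90, 0.80]
fused = late_fusion(probs, merits)
label = predict(probs, merits)
```

Here the first classifier favors class 0, but it carries the lowest weight, so the fused decision follows the two stronger classifiers.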


Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation using Natural Language Processing

Osterhoudt, Hunter, Schneider, Courtney E., Mohammad, Haneef A, Shih, Minmei, Harper, Alexandra E., Zhou, Leming, Skidmore, Elizabeth R, Wang, Yanshan

arXiv.org Artificial Intelligence

Strategy training is a multidisciplinary rehabilitation approach that teaches skills to reduce disability among those with cognitive impairments following a stroke. Strategy training has been shown in randomized, controlled clinical trials to be a more feasible and efficacious intervention for promoting independence than traditional rehabilitation approaches. A standardized fidelity assessment is used to measure adherence to treatment principles by examining guided and directed verbal cues in video recordings of rehabilitation sessions. Although the fidelity assessment for detecting guided and directed verbal cues is valid and feasible for single-site studies, it can become labor intensive, time consuming, and expensive in large, multi-site pragmatic trials. To address this challenge to widespread strategy training implementation, we leveraged natural language processing (NLP) techniques to automate the strategy training fidelity assessment, i.e., to automatically identify guided and directed verbal cues from video recordings of rehabilitation sessions. We developed a rule-based NLP algorithm, a long short-term memory (LSTM) model, and a Bidirectional Encoder Representations from Transformers (BERT) model for this task. The best performance was achieved by the BERT model with a 0.8075 F1-score. This BERT model was verified on an external validation dataset collected from a separate major regional health system and achieved an F1-score of 0.8259, which shows that the BERT model generalizes well. Introduction: Stroke is a leading cause of disability in the United States. Meta-cognitive strategy training (henceforth referred to as strategy training) is a multidisciplinary rehabilitation approach that teaches skills to reduce disability among those with cognitive impairments following a stroke.


How Natural Language Processing (NLP) Works To Make Smart Machines Smarter

#artificialintelligence

Part of the implicit promise and marketing hype around Machine Learning is that these technologies will make life easier for humans by reading and processing information with us. The basic premise of NLP is the ability -- after training -- to recognise and classify the intent in natural text such as a blog article or speech. Natural language processing (NLP) is a branch of artificial intelligence that deals with the interaction between humans and computers using natural language. NLP is used to build applications that can understand human languages and respond in a way that is natural for humans. NLP is used to create chatbots, analyse sentiment, extract information, translate text, and more.


What is Natural Language Processing (NLP)?

#artificialintelligence

We all know that machines are able to understand our language. But have we ever tried to understand how that is possible, and which code, software, or programs run behind the scenes to make it happen? This blog, therefore, is devoted to Natural Language Processing (NLP), which is behind this hyper-intelligent technology. Natural Language Processing is simply defined as automation, built on algorithms, that makes natural language understandable to machines. NLP algorithms convert human language into a form that computers can process, making interaction between humans and machines possible.


Automated interpretation of stress echocardiography reports using natural language processing

#artificialintelligence

Cardiologists had good agreement on the overall stress echocardiography (SE) results on the 140 reports: Kappa (0.83) and intraclass correlation coefficient (0.89). The NLP algorithm achieved 98.6% specificity and negative predictive value, and 95.7% sensitivity, positive predictive value, and F-score on the overall SE results, with near-perfect scores on ischemia findings. The 30-day acute myocardial infarction or death outcomes were highest among patients with ischemia (5.0%), followed by infarction (1.4%), non-diagnostic (0.8%), and normal (0.3%) results. We found substantial variations in the format and quality of SE reports, even within the same institution.
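The metrics named above are all derived from the four cells of a binary confusion matrix. The Python sketch below shows the standard definitions; the counts are made up for illustration and are not taken from the study, and the function assumes every denominator is nonzero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is nonzero (no guard against empty classes).
    """
    sensitivity = tp / (tp + fn)  # recall: share of true positives found
    specificity = tn / (tn + fp)  # share of true negatives found
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Illustrative counts only; not from the study.
m = binary_metrics(tp=45, fp=5, tn=90, fn=10)
```

The F-score is the harmonic mean of PPV and sensitivity, so it drops when either one is low, even if the other is close to perfect.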


What is Natural Language Processing (NLP), and Why is it Even Relevant

#artificialintelligence

Have you ever given any thought to how digital assistants like Siri can accurately understand our speech and return highly specific answers? Or have you ever tried to delve into the Alexa schematics just to empathize better with this virtual assistant? I bet not, and that is exactly why you need to get even more familiar with a term called NLP, or Natural Language Processing. NLP is one of the major AI technologies aimed at making machines capable of interpreting speech and text-based human language. And if you are still unsure about the utilities involved, NLP forms the backbone of several common tools like chatbots, grammar checkers, translation software modules, spam filters, and even scaled search engines.